Abstract The ability to model search in a constraint solver can be an essential asset for
solving combinatorial problems. However, existing infrastructure for defining search heuristics
is often inadequate. Either modeling capabilities are extremely limited or users are faced 
with a general-purpose programming language whose features are not tailored towards writ- 
ing search heuristics. As a result, major improvements in performance may remain unex- 
plored. 

This article introduces search combinators, a lightweight and solver-independent method 
that bridges the gap between a conceptually simple modeling language for search (high- 
level, functional and naturally compositional) and an efficient implementation (low-level, 
imperative and highly non-modular). By allowing the user to define application-tailored 
search strategies from a small set of primitives, search combinators effectively provide a 
rich domain-specific language (DSL) for modeling search to the user. Remarkably, this DSL
comes at a low implementation cost to the developer of a constraint solver.

The article discusses two modular implementation approaches and shows, by empirical 
evaluation, that search combinators can be implemented without overhead compared to a 
native, direct implementation in a constraint solver. 
1 Introduction

Search heuristics often make all the difference between effectively solving a combinato- 
rial problem and utter failure. Heuristics make a search algorithm efficient for a variety of 
reasons, e.g., incorporation of domain knowledge, or randomization to avoid heavy-tailed 
runtimes. Hence, the ability to swiftly design search heuristics that are tailored towards a 
problem domain is essential for performance. This article introduces search combinators, a 
versatile, modular, and efficiently implementable language for expressing search heuristics. 



1.1 Status Quo 

In CP, much attention has been devoted to facilitating the modeling of combinatorial prob- 
lems. A range of high-level modeling languages, such as Zinc [ ], OPL [2 l >] or Comet [27], 
enable quick development and exploration of problem models. However, we see very little 
support on the side of formulating accompanying search heuristics. Most languages and 
systems, e.g. MiniZinc [ ], Comet [27], Gecode [23], or ECLiPSe [ ] provide a small 
set of predefined heuristics "off the shelf". Some systems also support user-defined search 
based on a general-purpose programming language (e.g., all of the above systems except 
MiniZinc). The former is clearly too confining, while the latter leaves much to be desired in terms
of productivity, since implementing a search heuristic quickly becomes a non-negligible ef- 
fort. This also explains why the set of predefined heuristics is typically small: it takes a lot of 
time for CP system developers to implement heuristics, too - time they would much rather 
spend otherwise improving their system. 



1.2 Contributions 

In this article we show how to resolve this stand-off between solver developers and users, 
by introducing a domain-specific modular search language based on combinators, as well as 
a modular, extensible implementation architecture. 

For the user, we provide a modeling language for expressing complex search heuristics 
based on an (extensible) set of primitive combinators. Even if the users are only pro- 
vided with a small set of combinators, they can already express a vast range of combi- 
nations. Moreover, using combinators to program application-tailored search is vastly 
more productive than resorting to a general-purpose language. 

For the system developer, we show how to design and implement modular combinators. 
The modularity of the language thus carries over directly to modularity of the imple- 
mentation. Developers do not have to cater explicitly for all possible combinator com- 
binations. Small implementation efforts result in providing the user with a lot of expres- 
sive power. Moreover, the cost of adding one more combinator is small, yet the return in 
terms of additional expressiveness can be quite large. 

The technical challenge is to bridge the gap between a conceptually simple search lan- 
guage and an efficient implementation, which is typically low-level, imperative and highly 
non-modular. This is where existing approaches fail; they restrict the expressiveness of their 
search specification language to face up to implementation limitations, or they raise errors 
when the user strays out of the implemented subset. 
The contribution is therefore the novel design of an expressive, high-level, compositional
search language with an equally modular, extensible, and efficient implementation
architecture. 



1.3 Approach 

We overcome the modularity challenge by implementing the primitives of our search lan- 
guage as mixin components [ ]. As in Aspect-Oriented Programming [ ], mixin compo- 
nents neatly encapsulate the cross-cutting behavior of primitive search concepts, which are 
highly entangled in conventional approaches. Cross-cutting means that a mixin component 
can interfere with the behavior of its sub-components (in this case, sub-searches). The com- 
bination of encapsulation and cross-cutting behavior is essential for systematic reuse of 
search combinators. Without this degree of modularity, minor modifications require rewrit- 
ing from scratch. 

An added advantage of mixin components is extensibility. We can add new features to 
the language by adding more mixin components. The cost of adding such a new component 
is small, because it does not require changes to the existing ones. Moreover, experimental 
evaluation bears out that this modular approach has no significant overhead compared to the 
traditional monolithic approach. Finally, our approach is solver-independent and therefore 
makes search combinators a potential standard for designing search. 



1.4 Plan of the Article 

The rest of the article is structured as follows. The next section defines the high-level search 
language in terms of basic heuristics and combinators. Sect. 3 shows how the modular lan- 
guage is mapped to a modular design of the combinator implementations. Sect. 4 presents 
two concrete implementation approaches for combinators and gives an overview of how we 
integrate search combinators into the MiniZinc toolchain. Sect. 5 verifies that combinators 
can be implemented with low overhead. Finally, Sect. 6 discusses related approaches, and 
Sect. 7 concludes the article. 



1.5 Note to Reviewers 

This article is an extended version of a paper [ ] that appeared in the proceedings of the 
17th International Conference on Principles and Practice of Constraint Programming (CP) 
2011. That paper further developed the ideas laid out in our earlier paper [ ], which was
presented at ModRef 2010. 

Compared to the CP'11 conference version, this article features a completely rewritten
and more detailed introduction of the combinator language, both from the high-level
(Sect. 2) and the implementation-level (Sect. 3) point of view. We have completed the inte- 
gration into MiniZinc and present a full toolchain in Sect. 4.3. In addition, the discussion of 
related work (Sect. 6) has been extended with much more detail. 
and([s1, s2, ..., sn])          perform s1, on termination start s2, ...
portfolio([s1, s2, ..., sn])    perform s1, if not exhaustive start s2, ...
restart(c, s)                   restart s as long as c holds
let(v, e, s)                    introduce new variable v with initial value e, then perform s
assign(v, e)                    assign e to variable v and succeed
post(c, s)                      post constraint c at every node during s

Fig. 1: Catalog of primitive search heuristics and combinators 

2 High-Level Search Language 

This section introduces the syntax of our high-level search language and illustrates its ex- 
pressive power and modularity by means of examples. The rest of the article then presents 
an architecture that maps the modularity of the language down to the implementation level. 

The search language is used to define a search heuristic, which a search engine applies 
to each node of the search tree. For each node, the heuristic determines whether to continue 
search by creating child nodes, or to prune the tree at that node. The queuing strategy, i.e., the 
strategy by which new nodes are selected for further search (such as depth-first traversal), is 
determined separately by the search engine and is thus orthogonal to the search language. The
search language features a number of primitives, listed in the catalog of Fig. 1. These are
the building blocks in terms of which more complex heuristics can be defined, and they can
be grouped into basic heuristics (base_search and prune), combinators (ifthenelse, and, or,
portfolio, and restart), and state management (let, assign, post). This section introduces the 
three groups of primitives in turn. 

We emphasize that this catalog is open-ended; we will see that the language implemen- 
tation explicitly supports adding new primitives. 

The concrete syntax we chose for presentation uses simple nested terms, which makes 
it compatible with the annotation language of MiniZinc [ ]. Sect. 4.3 discusses our imple- 
mentation of MiniZinc with combinator support. However, other concrete syntax forms are 
easily supported (e.g., we support C++ and Haskell).

2.1 Basic Heuristics 

Let us first discuss the two basic primitives, base_search and prune.

base_search. The most widely used method for specifying a basic heuristic for a constraint 
problem is to define it in terms of a variable selection strategy which picks the next variable 
to constrain, and a domain splitting strategy which splits the set of possible values of the 
selected variable into two (or more) disjoint sets. Common variable selection strategies are: 

- firstfail: select the variable with the smallest current domain, 

- smallest: select the variable which can take the smallest possible value, 

- domwdeg [2]: select the variable with the smallest ratio of current domain size to the
number of failures the variable has been involved in, and

- impact [ ]: select the variable that will (based on past experience) reduce the raw search
space of the problem the most.
Common domain splitting strategies are:

- min: set the variable to its minimum value or greater than its minimum, 

- max: set the variable to its maximum value or less than its maximum, 

- median: set the variable to its median value, or not equal to this value, and 

- split: constrain the variable to the lower half of its range of possible values, or its upper 
half. 

The CP community has spent a considerable amount of work on defining and exploring 
the above and many other variable selection and domain splitting heuristics. The provision 
of a flexible language for defining new basic searches is an interesting problem in its own 
right, but in this article we concentrate on search combinators that combine and modify 
basic searches. 

To this end, our search language provides the primitive base_search(vars, var-select, 
domain-split), which specifies a systematic search. If any of the variables vars are still not
fixed at the current node, it creates child nodes according to var-select and domain-split as 
variable selection and domain splitting strategies respectively. 
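
As a rough illustration of these two ingredients, the following Python sketch implements firstfail selection and a min split over domains represented as sets of integers. The representation (a dictionary from variable names to sets) and the function names first_fail and split_min are assumptions made for this example; they do not correspond to the API of any particular solver.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical representation: variable name -> current domain.
Domains = Dict[str, Set[int]]


def first_fail(domains: Domains, variables: List[str]) -> Optional[str]:
    """Return the unfixed variable with the smallest domain, or None if all are fixed."""
    unfixed = [v for v in variables if len(domains[v]) > 1]
    if not unfixed:
        return None
    return min(unfixed, key=lambda v: len(domains[v]))


def split_min(domains: Domains, var: str) -> Tuple[Domains, Domains]:
    """Split on the minimum: the left child sets var = min, the right child requires var > min."""
    lo = min(domains[var])
    left = {**domains, var: {lo}}
    right = {**domains, var: {x for x in domains[var] if x > lo}}
    return left, right
```

Both functions leave the input untouched and return fresh dictionaries, which mirrors the idea that each child node owns its own copy of the solver state.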

Note that base_search is a CP-specific primitive; other kinds of solvers provide their
own search primitives. The rest of the search language is essentially solver-independent. 
While the solver provides few basic heuristics, the search language adds great expressive 
power by allowing these to be combined arbitrarily using combinators. 



prune. The second basic primitive, prune, simply cuts the search tree below the current 
node. Obviously, this primitive is useless on its own, but we will see shortly how prune can 
be used together with combinators. 



2.2 Combinators 

The expressive power of the search language relies on combinators, which combine search 
heuristics (which can be basic or themselves constructed using combinators) into more com- 
plex heuristics. 

and/or. Probably the most widely used combination of heuristics is sequential composition. 
For instance, it is often useful to first label one set of problem variables before starting to 
label a second set. The following heuristic uses the and combinator to first label all the xs 
variables using a first-fail strategy, followed by the ys variables with a different strategy: 

and([base_search(xs, firstfail, min),
     base_search(ys, smallest, max)])

As you can see in Fig. 1, the and combinator accepts a list of searches s1,...,sn, and
performs their and-sequential composition. And-sequential means, intuitively, that solutions 
are found by performing all the sub-searches sequentially down one branch of the search 
tree, as illustrated in Fig. 2.1. 

The dual combinator, or([s1,...,sn]), performs a disjunctive combination of its sub-searches:
a solution is found using any of the sub-searches (Fig. 2.2), trying them in the
given order. 
[Figure panels: 1) and([s1,s2]); 2) or([s1,s2]); 3) if(c,s1,s2); 4) portfolio([s1,s2,s3]); 5) restart(c,s)]

Fig. 2: Primitive combinators



Statistics and ifthenelse. The ifthenelse combinator is centered around a conditional expression
c. As long as c is true for the current node, the sub-search s1 is used. Once c is false,
s2 is used for the complete subtree below the current node (see Fig. 2.3).

We do not specify the expression language for conditions in detail; we simply assume
that it comprises the typical arithmetic and comparison operators and literals that require
no further explanation. It is notable though that the language can refer to the constraint
variables and parameters of the underlying model. Additionally, a condition may refer to one
or more statistics variables. Such statistics are collected for the duration of a subsearch until
the condition is met. For instance, ifthenelse(depth < 10, s1, s2) maintains the search depth
statistic during subsearch s1. At depth 10, the ifthenelse combinator switches to subsearch
s2.

We distinguish two forms of statistics: Local statistics such as depth and discrepancies 
express properties of individual nodes. Global statistics such as number of explored nodes, 
encountered failures, solutions, and time are computed for entire search trees.

It is worthwhile to mention that developers (and advanced users) can also define their 
own statistics, just like combinators, to complement any predefined ones. In fact, Sect. 3 will 
show that statistics can be implemented as a subtype of combinators that can be queried for 
the statistic's value. 



Abstraction. Our search language draws its expressive power from the combination of 
primitive heuristics using combinators. An important aspect of the search language is ab- 
straction: the ability to create new combinators by effectively defining macros in terms of 
existing combinators. 

For example, we can define the limiting combinator limit(c, s) to perform s while condition
c is satisfied, and otherwise cut the search tree using prune:

limit(c, s) = ifthenelse(c, s, prune)



The well-known once(s) combinator is a special case of the limiting combinator where the
number of solutions is less than one. This is simply achieved by maintaining and accessing
the solutions statistic:



once(s) = limit(solutions < 1, s)
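
To make the macro character of these definitions concrete, here is a minimal Python sketch in which search terms are plain data and limit and once are ordinary functions that expand into primitives. The Term class and the use of strings for conditions are assumptions of this sketch, not part of the language definition.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """A nested term such as ifthenelse(c, s, prune); arguments are terms or plain values."""
    name: str
    args: Tuple = ()

    def __str__(self) -> str:
        if not self.args:
            return self.name
        return f"{self.name}({', '.join(str(a) for a in self.args)})"


PRUNE = Term("prune")


def ifthenelse(c, s1, s2) -> Term:
    return Term("ifthenelse", (c, s1, s2))


def limit(c, s) -> Term:
    # limit(c, s) = ifthenelse(c, s, prune)
    return ifthenelse(c, s, PRUNE)


def once(s) -> Term:
    # once(s) = limit(solutions < 1, s); the condition is kept as an opaque string here
    return limit("solutions < 1", s)
```

Printing once(Term("s")) shows the fully expanded term, which contains only the primitives ifthenelse and prune.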



Exhaustiveness and portfolio/restart. The behavior of the final two combinators, portfo- 
lio and restart, depends on whether their sub-search was exhaustive. Exhaustiveness simply 
means that the search has explored the entire subtree without ever invoking the prune prim- 
itive. 

The portfolio([s1, ..., sn]) combinator performs s1 until it has explored the whole subtree.
If s1 was exhaustive, i.e., if it did not call prune during the exploration of the subtree, the
search is finished. Otherwise, it continues with portfolio([s2, ..., sn]). This is illustrated in
Fig. 2.4, where the subtree of s1 represents a non-exhaustive search, s2 is exhaustive and
therefore s3 is never invoked.

An example for the use of portfolio is the hotstart(c, s1, s2) combinator. It performs
search heuristic s1 while condition c holds to initialize global parameters for a second search
s2. This heuristic can for example be used to initialize the widely applied Impact heuristic
[18]. Note that we assume here that the parameters to be initialized are maintained by the
underlying solver, so we omit an explicit reference to them.

hotstart(c, s1, s2) = portfolio([limit(c, s1), s2])



The restart(c, s) combinator repeatedly runs s in full. If s was not exhaustive, it is 
restarted, until condition c no longer holds. Fig. 2.5 shows the two cases, on the left ter- 
minating with an exhaustive search s, on the right terminating because c is no longer true. 

The following implements random restarts, where search is stopped after 1000 failures 
and restarted with a random strategy: 



restart(true, limit(failures < 1000,base_search(xs,randomvar,randomval))) 



Clearly, this strategy has a flaw: If it takes more than 1000 failures to find the solution, 
the search will never finish. We will shortly see how to fix this by introducing user-defined 
search variables. 

The prune primitive is the only source of non-exhaustiveness. Combinators propagate 
exhaustiveness in the obvious way: 

- and([s1, ..., sn]) is exhaustive if all si are

- or([s1, ..., sn]) is exhaustive if all si are

- portfolio([s1, ..., sn]) is exhaustive if one si is

- restart(c, s) is exhaustive if the last iteration is exhaustive

- ifthenelse(c, s1, s2) is exhaustive if s1 and s2 are
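
The propagation rules above can be written down as a small recursive function. In the following Python sketch, the outcome of a finished search is modeled as either a Boolean (True meaning that the run never invoked prune) or a pair of a combinator name and the outcomes of the sub-searches that actually ran; for restart, the list holds the successive iterations. This encoding is an assumption made purely for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical encoding of a finished search: a bool for a basic heuristic,
# or (combinator name, outcomes of the sub-searches that were run).
Outcome = Union[bool, Tuple[str, List["Outcome"]]]


def exhaustive(outcome: Outcome) -> bool:
    """Apply the exhaustiveness rules for and, or, portfolio, restart, and ifthenelse."""
    if isinstance(outcome, bool):
        return outcome
    name, children = outcome
    flags = [exhaustive(child) for child in children]
    if name in ("and", "or", "ifthenelse"):
        return all(flags)
    if name == "portfolio":
        return any(flags)
    if name == "restart":
        return flags[-1]  # only the last iteration counts
    raise ValueError(f"unknown combinator: {name}")
```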



2.3 State Access and Manipulation 

The remaining three primitives, let, assign, and post, are used to access and manipulate the 
state of the search: 
- let(v, e, s): introduces a new search variable v with initial value of the expression e and
visible in the search s, then continues with s. Note that search variables are distinct from
the decision variables of the model.

- assign(v, e): assigns the value of the expression e to search variable v and succeeds.

- post(c, s): provides access to the underlying constraint solver, posting a constraint c at
every node during s. If s is omitted, it posts the constraint and immediately succeeds.

These primitives add a great deal of expressivity to the language, as the following ex- 
amples demonstrate. 

Random restarts: Let us reconsider the example using random restarts from the previous 
section, which suffered from incompleteness because it only ever explored 1000 failures. A 
standard way to make this strategy complete is to increase the limit geometrically with each 
iteration: 



geomrestart(s) = let(maxfails, 100,
    restart(true, portfolio([limit(failures < maxfails, s),
                             and([assign(maxfails, maxfails * 1.5),
                                  prune])])))

The search initializes the search variable maxfails to 100, and then calls search s with 
maxfails as the limit. If the search is exhaustive, both the portfolio and the restart combi- 
nators are finished. If the search is not exhaustive, the limit is multiplied by 1.5, and the 
search starts over. Note that assign succeeds, so we need to call prune afterwards in order 
to propagate the non-exhaustiveness of s to the restart combinator. 
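
The effect of the geometric update can be checked with a few lines of Python. The sketch below generates the successive values of maxfails and counts how many runs are needed before a search that encounters a given number of failures fits under the limit. The function names and the simplifying assumption that every run encounters the same number of failures are hypothetical and serve only this illustration.

```python
from itertools import islice
from typing import Iterator


def geometric_limits(initial: float = 100, factor: float = 1.5) -> Iterator[float]:
    """Yield the successive failure limits: initial, initial * factor, ..."""
    limit = initial
    while True:
        yield limit
        limit *= factor


def runs_needed(required_failures: int, initial: float = 100, factor: float = 1.5) -> int:
    """Number of runs until the condition failures < maxfails holds for the whole run."""
    for run, limit in enumerate(geometric_limits(initial, factor), start=1):
        if required_failures < limit:
            return run
    raise AssertionError("unreachable: the limits grow without bound")


first_four = list(islice(geometric_limits(), 4))
```

Under these assumptions, a search that needs 300 failures completes in its fourth run, when the limit has grown to 337.5.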

Branch-and-bound: A slightly more advanced example is the branch-and-bound optimiza- 
tion strategy: 

bab(obj, s) = let(best, ∞, post(obj < best, and([s, assign(best, obj)])))



It introduces a variable best that initially takes value ∞ (for minimization). In every node, it
posts a constraint to bound the objective variable by best. Whenever a new solution is found, 
the bound is updated accordingly using assign. 

The bab example demonstrates how search variables (like best) and model variables 
(like obj) can be mixed in expressions. This makes it possible to remember the state of the
search between invocations of a heuristic. All of the following combinators make use of this 
feature. 

Restarting branch-and-bound: This is a twist on regular branch-and-bound that restarts 
whenever a solution is found. 



restart_bab(obj, s) = let(best, ∞, restart(true, and([post(obj < best), once(s),
                                                     assign(best, obj)])))



Radiotherapy treatment planning: The following search heuristic can be used to solve ra- 
diotherapy treatment planning problems [ ]. The heuristic minimizes a variable k using 
branch-and-bound (bab), first searching the variables N, and then verifying the solution 
by partitioning the problem along the row_i variables for each row i one at a time (expressed
as a MiniZinc array comprehension). Failure on one row must be caused by the search on 
the variables in N, and consequently search never backtracks into other rows. 
This behavior is similar to the once combinator defined above. However, when a single
solution is found, the search should be considered exhaustive. We therefore need an exhaustive
variant of once, which can be implemented by replacing prune with post(false):



exh_once(s) = ifthenelse(solutions < 1, s, post(false))



This allows us to express the entire search strategy for radiotherapy treatment planning: 



bab(k, and([base_search(N, ...)] ++
           [exh_once(base_search(row_i, ...)) | i in 1..n]))



For: The for loop construct (v ∈ [l, u]) can be defined as:



for(v, l, u, s) = let(v, l, restart(v ≤ u,
                  portfolio([s, and([assign(v, v+1), prune])])))



It simply runs the search s u − l + 1 times, which of course is only sensible if s makes
use of side effects or the loop variable v. As in the geomrestart combinator above, prune 
propagates the non-exhaustiveness of s to the restart combinator. 

Limited discrepancy search [4] with an upper limit of l discrepancies for an underlying
search s.



lds(l, s) = for(n, 0, l, limit(discrepancies < n, s))



The for construct iterates the maximum number of discrepancies n from 0 to l, while limit
executes s as long as the number of discrepancies is smaller than n. The search makes use 
of the discrepancies statistic that is maintained by the search infrastructure. The original 
LDS [ ] visits the nodes in a specific order. The search described here visits the same nodes 
in the same order of discrepancies, but possibly in a different individual order - as this is 
determined by the global queuing strategy. 

The following is a combination of branch-and-bound and limited discrepancy search 
for solving job shop scheduling problems, as described in [ ]. The heuristic searches the 
Boolean variables prec, which determine the order of all pairs of tasks on the same machine. 
As the order completely determines the schedule, we then fix the start times using exh_once.



bab(makespan, lds(∞, and([base_search(prec, ...),
                          exh_once(base_search(start, ...))])))



Fully expanded, this heuristic consists of 17 combinators and is 11 combinators deep.

Iterative deepening [11] for an underlying search s is a particular instance of the more
general pattern of restarting with an updated bound, which we have already seen in the 
geomrestart example. Here, we generalize this idea: 



id(s) = ir(depth, 0, +, 1, ∞, s)

ir(p, l, ⊕, i, u, s) = let(n, l, restart(n ≤ u, and([assign(n, n ⊕ i),
                                                   limit(p ≤ n, s)])))



With let, bound n is initialized to l. Search s is pruned when statistic p exceeds n, but iteratively
restarted by restart with n updated to n ⊕ i. The repetition stops when n exceeds u or
when s has been fully explored. The bound increases geometrically, if we supply * for ⊕, as
in the restart_flip heuristic:



restart_flip(p, l, i, u, s1, s2) = let(flip, 1, ir(p, l, *, i, u, and([assign(flip, 1 − flip),
                                                              ifthenelse(flip = 1, s1, s2)])))
The restart_flip search alternates between two search heuristics s1 and s2. Using this as its
default strategy in the free search category, the lazy clause generation solver Chuffed scored
most points in the 2010 and 2011 MiniZinc Challenges.

Probe search: Try out two searches s1 and s2 to a limited extent defined by condition c.
Then, for the remainder, use the search that resulted in the best solution so far.



probe(c, obj, s1, s2) = let(best1, ∞, let(best2, ∞,
    portfolio([limit(c, and([s1, assign(best1, obj)])),
               limit(c, and([s2, assign(best2, obj)])),
               ifthenelse(best1 < best2, s1, s2)])))



Dichotomic search [ ] solves an optimization problem by repeatedly partitioning the interval
in which the possible optimal solution can lie. It can be implemented by restarting as
long as the lower bound has not met the upper bound (line 2), computing the middle (line 3),
and then using an or combinator to try the lower half (line 5). If it succeeds, obj − 1 is the
new upper bound, otherwise, the lower bound is increased (line 6).



dicho(s, obj, lb, ub) = let(l, lb, let(u, ub,
    restart(l < u,
        let(h, l + ⌈(u − l)/2⌉,
            once(or([
                and([post(l ≤ obj ≤ h), s, assign(u, obj − 1)]),
                and([assign(l, h + 1), prune])]))
    ))))
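
The interval arithmetic of dichotomic search can be tried out independently of any solver. The following Python sketch replaces the sub-search s by a hypothetical oracle find(lo, hi) that returns the objective value of some solution with lo ≤ obj ≤ hi, or None if there is none. Note one deliberate deviation, which is an assumption of this sketch: the loop runs while l <= u rather than l < u, so that the case where both bounds coincide is probed as well.

```python
from typing import Callable, Optional


def dichotomic_min(find: Callable[[int, int], Optional[int]],
                   lb: int, ub: int) -> Optional[int]:
    """Minimize an integer objective in [lb, ub] using a feasibility oracle."""
    l, u = lb, ub
    best: Optional[int] = None
    while l <= u:                     # assumption: also probe when l == u
        h = l + (u - l + 1) // 2      # l + ceil((u - l) / 2)
        obj = find(l, h)              # try the lower part of the interval
        if obj is not None:
            best = obj
            u = obj - 1               # a solution was found: tighten the upper bound
        else:
            l = h + 1                 # no solution in [l, h]: raise the lower bound
    return best


# Toy oracle: objective values 3..10 are feasible; it returns the largest one in range.
feasible = set(range(3, 11))


def toy_find(lo: int, hi: int) -> Optional[int]:
    return max((v for v in feasible if lo <= v <= hi), default=None)


result = dichotomic_min(toy_find, 0, 10)
```

The toy oracle deliberately returns the worst solution in the requested range, so the bounds have to do all the work of converging on the minimum.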



3 Modular Combinator Design 

The previous section caters for the user's needs, presenting a high-level modular syntax 
for our combinator-based search language. To cater for the system developer's needs, this 
section goes beyond modularity of syntax, introducing modularity of design. 

Modularity of design is the one property that makes our approach practical. Each combina- 
tor corresponds to a separate module that has a meaning and an implementation independent 
of the other combinators. This enables us to actually realize the search specifications defined 
by modular syntax. 

Modularity of design also enables growing a system from a small set of combinators 
(e.g., those listed in Fig. 1), gradually adding more as the need arises. Advanced users can 
complement the system's generic combinators with a few application-specific ones. 

Solver independence is another notable property of our approach. While a few combinators 
access solver-specific functionality (e.g., base_search and post), the approach as such and
most combinators listed in Fig. 1 are in fact generic (solver- and even CP-independent); their 
design and implementation is reusable. 

The solver-independence of our approach is reflected in the minimal interface that solvers
must implement. This interface consists of an abstract type State which represents a state
of the solver (e.g., the variable domains and accumulated constraint propagators) and which
supports copying. Nothing more is needed for the approach or for any of the primitive combinators
in Fig. 1, except for base_search and post, which require CP-aware operations for querying
variable domains, solver status and posting constraints, and possibly interacting with statistics
maintained by the solver. Note that there need not be a 1-to-1 correspondence between
an implementation of the abstract State type and the solver's actual state representation;
e.g., for solvers based on trailing, recomputation techniques [ ] can be used. We have
implementations of the interface based on both copying and trailing.

[Figure: a stack of combinators 1, 2, ..., n-1, n, annotated with the messages start(r) and enter(p) and the note "for every child c"]

Fig. 3: The modular message protocol

In the following we explain our design in detail by means of code implementations of most 
of the primitive combinators we have covered in the previous section. 



3.1 The Message Protocol 

To obtain a modular design of search combinators we step away from the idea that the be- 
havior of a search combinator, like the and combinator, forms an indivisible whole; this 
leaves no room for interaction. The key insight here is that we must identify finer-grained 
steps, defining how different combinators interact at each node in the search tree. Interleav- 
ing these finer-grained steps of different combinators in an appropriate manner yields the 
composite behavior of the overall search heuristic, where each combinator is able to cross- 
cut the others' behavior. 

Considering the diversity of combinators and the fact that not all units of behavior are ex- 
plicitly present in all of them, designing this protocol of interaction is non-trivial. It requires 
studying the intended behavior and interaction of combinators to isolate the fine-grained 
units of behavior and the manner of interaction. The contribution of this section is an ele- 
gant and conceptually uniform design that is powerful enough to express all the combinators 
presented in this article. 

We present this design in the form of a message protocol. The protocol specifies a set 
of messages (i.e., an interface with one procedure for each fine-grained step) that have to be 
implemented by all combinators. In pseudo-code, this protocol for combinators consists of
four different messages:

protocol combinator
  start(rootNode);
  enter(currentNode);
  exit(currentNode, status);
  init(parentNode, childNode);

All of the message signatures specify one or two search tree nodes as parameters. Each 
such node keeps track of a solver State and the information associated by combinators to 
that State. We observe three different access patterns of nodes: 

1. In keeping with the solver independence stipulated above, we will see that most combinators
only query and update their associated information and do not access the underlying
solver State at all.

2. Restarting-based combinators, like restart and portfolio, copy nodes. This means 
copying the underlying solver State and all associated information. 

3. Finally, selected solver-specific combinators like base_search do perform solver- 
specific operations on the underlying State, like querying variable domains and post- 
ing constraints. 

In addition to the message signatures, the protocol also stipulates in what order the 
messages are sent among the combinators (see Fig. 3). While in general a combinator com- 
position is tree-shaped, the processing of any single search tree node p only involves a stack 
of combinators. For example, given or([and([s1, s2]), and([s3, s4])]), either s1, s2 or s3, s4 are
active for p. The picture shows this stack of active combinators on the left. Every combinator
in the stack has both a super-combinator above and a sub-combinator below, except for
the top and the bottom combinators. The bottom is always a basic heuristic (base_search,
prune, assign, or post). The important aspect to take away from the picture is the direction
of the four different messages, either top-down or bottom-up. 
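
For readers who prefer a concrete rendering, the four messages can be expressed as a Python abstract base class. This is only a sketch: nodes are left untyped, the status is passed as a string, and the Recorder class is a hypothetical combinator that merely logs what it receives.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class Combinator(ABC):
    """The four messages of the protocol; every combinator must implement all of them."""

    @abstractmethod
    def start(self, root_node: Any) -> None: ...

    @abstractmethod
    def enter(self, current_node: Any) -> None: ...

    @abstractmethod
    def exit(self, current_node: Any, status: str) -> None: ...

    @abstractmethod
    def init(self, parent_node: Any, child_node: Any) -> None: ...


class Recorder(Combinator):
    """Hypothetical combinator that only logs the messages it receives."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def start(self, root_node: Any) -> None:
        self.log.append(f"start({root_node})")

    def enter(self, current_node: Any) -> None:
        self.log.append(f"enter({current_node})")

    def exit(self, current_node: Any, status: str) -> None:
        self.log.append(f"exit({current_node}, {status})")

    def init(self, parent_node: Any, child_node: Any) -> None:
        self.log.append(f"init({parent_node}, {child_node})")
```

Because the base class is abstract, a combinator that forgets one of the four messages cannot be instantiated, which is one simple way to enforce the protocol.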



3.2 Basic Setup 

Before we delve into the interesting search combinators, we first present an example implementation
of the basic setup consisting of a base search (base_search) and a search engine
(dfs). This allows us to express overall search specifications of the form:
dfs(base_search(vars, var-select, domain-split)).

Base Search. We do not provide full details on a base_search combinator, as it is not the
focus of this article. However, we will point out the aspects relevant to our protocol.

In the enter message, the node's solver state is propagated. Subsequently, the condition
isLeaf(c, vars) checks whether the solver state is unsatisfiable or there are no more
variables to assign. If either is the case, the exit status (respectively failure or success)
is sent to the parent combinator. For now, the parent combinator is just the search engine,
but later we will see how other combinators can be inserted between the search
engine and the base search.

If neither is the case, the search branches depending on the variable selection and domain 
splitting strategies. This involves creating a child node for each branch, determining the 
variable and value for that child and posting the assignment to the child's state. Then, the 
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top combinator (i.e., the engine) is asked to initialize the child node. Finally the child node 
is pushed onto the search queue. 



combinator base_search(vars, var-select, domain-select)
  enter(c):
    c.propagate
    if isLeaf(c, vars)
      parent.exit(c, leafstatus(c))
    else
      pos = ... // from vars based on var-select
      for each child: // based on domain-select
        val = ... // from values of var based on domain-select
        child.post(vars[pos]=val)
        top.init(c, child)
        queue.push(child)

Note that, as the base_search combinator is a base combinator, its exit message is
immaterial (there is no child heuristic of base_search that could ever call it). The start and
init messages are empty. Many variants on and generalizations of the above
implementation are possible.

Depth-first search engine. The engine dfs serves as a pseudo-combinator at the top of a
combinator expression heuristic and is the heuristic's immediate parent as
well. It maintains the queue of nodes, a stack in this case. The search starts from a given
root node by starting the heuristic with that node and then entering it. Each time
a node has been processed, new nodes may have been pushed onto the queue. These are
popped and entered successively.



combinator dfs(heuristic)
  start(root):
    top=this
    heuristic.parent=this
    queue=new stack()
    heuristic.start(root)
    heuristic.enter(root)
    while not queue.empty
      heuristic.enter(queue.pop())

  init(n, c):
    heuristic.init(n, c)



The engine's exit message is empty, the enter message is never called and the init 
message delegates initialization to the heuristic. 

Other engines may be formulated with different queuing strategies. 
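
As an illustration of the basic setup, the following is a minimal Python sketch of the engine and the base search. It is not the article's C++ implementation: the class names `Node`, `BaseSearch` and `DFS`, the `consistent` callback (a stand-in for propagation) and the `leaves` list (which records exit statuses only so that the run can be inspected) are assumptions made for this example.

```python
SUCCESS, FAILURE = "success", "failure"

class Node:
    """Search-tree node; `assignment` stands in for the solver State."""
    def __init__(self, assignment=()):
        self.assignment = dict(assignment)

class BaseSearch:
    """Base combinator: input-order variable selection, smallest value first."""
    def __init__(self, domains, consistent=lambda a: True):
        self.domains = domains          # dict: variable -> list of values
        self.consistent = consistent    # hypothetical stand-in for propagation
        self.parent = self.engine = None

    def start(self, root):              # empty, as described in the text
        pass

    def init(self, parent_node, child): # empty, as described in the text
        pass

    def enter(self, c):
        if not self.consistent(c.assignment):
            self.parent.exit(c, FAILURE)
            return
        free = [v for v in self.domains if v not in c.assignment]
        if not free:
            self.parent.exit(c, SUCCESS)
            return
        var = free[0]
        # Push in reverse so that the stack pops the smallest value first.
        for val in reversed(self.domains[var]):
            child = Node(c.assignment)
            child.assignment[var] = val          # child.post(vars[pos]=val)
            self.engine.init(c, child)           # top.init(c, child)
            self.engine.queue.append(child)      # queue.push(child)

class DFS:
    """Engine: owns the queue (a stack) and is the heuristic's parent."""
    def __init__(self, heuristic):
        self.heuristic = heuristic
        self.leaves = []                # demo-only record of exit statuses

    def start(self, root):
        self.queue = []
        self.heuristic.parent = self.heuristic.engine = self
        self.heuristic.start(root)
        self.heuristic.enter(root)
        while self.queue:
            self.heuristic.enter(self.queue.pop())

    def init(self, parent_node, child):
        self.heuristic.init(parent_node, child)

    def exit(self, c, status):
        # Empty in the text; recorded here only to make the run observable.
        self.leaves.append((status, dict(c.assignment)))

def differ(a):
    """Toy constraint x != y, checked once both variables are assigned."""
    return not ("x" in a and "y" in a and a["x"] == a["y"])

engine = DFS(BaseSearch({"x": [0, 1], "y": [0, 1]}, consistent=differ))
engine.start(Node())
solutions = [a for status, a in engine.leaves if status == SUCCESS]
print(solutions)
```

Replacing the list used as a stack by a FIFO or priority queue would give a different engine without touching `BaseSearch`.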



3.3 Combinator Composition 

The idea of search combinators is to augment a base_search. We illustrate this with a very
simple print combinator that prints out every solution as it is found. For simplicity we assume
a solution is just a set of constraint variables vars that is supplied as a parameter. Hence, we
obtain the basic search setup with solution printing with:

dfs(print(vars, base_search(vars, strategy)))

Print. The print combinator is parametrized by a set of variables vars and a search combi- 
nator child. Implicitly, in a composition, that child's parent is set to the print instance. 
The same holds for all following search combinators with one or more children. 

The only message of interest for print is exit. When the exit status is success, the
combinator prints the variables; in either case it propagates the message to its parent.



combinator print(vars, child)
  exit(c, status):
    if status==success
      print c.vars
    parent.exit(c, status)

The other messages are omitted. Their behavior is default: they all propagate to the child. 
The same holds for the omitted messages of following unary combinators. 
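
The default behavior of omitted messages can be sketched in Python as follows. This is an illustrative toy, not the article's library: `Combinator` supplies the defaults (top-down messages go to the child, exit goes to the parent), `Print` overrides only exit, and the names `wire`, `info`, `leaves` and the `out` list (used instead of writing to the console) are assumptions of this sketch.

```python
SUCCESS, FAILURE, ABORT = "success", "failure", "abort"

class Node:
    def __init__(self, assignment=()):
        self.assignment = dict(assignment)  # stand-in for the solver State
        self.info = {}                      # per-combinator associated information

class Combinator:
    """Default protocol: start/enter/init go to the first child, exit goes up."""
    def __init__(self, *children):
        self.children = children
    def wire(self, parent, engine):
        self.parent, self.engine = parent, engine
        for child in self.children:
            child.wire(self, engine)
    def start(self, root): self.children[0].start(root)
    def enter(self, c): self.children[0].enter(c)
    def init(self, p, c): self.children[0].init(p, c)
    def exit(self, c, status): self.parent.exit(c, status)

class BaseSearch(Combinator):
    def __init__(self, domains):
        super().__init__()
        self.domains = domains
    def start(self, root): pass
    def init(self, p, c): pass
    def enter(self, c):
        free = [v for v in self.domains if v not in c.assignment]
        if not free:
            return self.parent.exit(c, SUCCESS)
        for val in reversed(self.domains[free[0]]):  # stack pops smallest value first
            child = Node(c.assignment)
            child.assignment[free[0]] = val
            self.engine.init(c, child)
            self.engine.queue.append(child)

class DFS:
    def __init__(self, heuristic):
        self.heuristic, self.queue, self.leaves = heuristic, [], []
        heuristic.wire(self, self)
    def run(self, root):
        self.heuristic.start(root)
        self.heuristic.enter(root)
        while self.queue:
            self.heuristic.enter(self.queue.pop())
        return self.leaves
    def init(self, p, c): self.heuristic.init(p, c)
    def exit(self, c, status):  # demo-only: record what reaches the engine
        self.leaves.append((status, dict(c.assignment)))

class Print(Combinator):
    """Unary combinator: only exit is overridden, everything else is default."""
    def __init__(self, variables, child, out):
        super().__init__(child)
        self.variables, self.out = variables, out
    def exit(self, c, status):
        if status == SUCCESS:
            self.out.append(tuple(c.assignment[v] for v in self.variables))
        self.parent.exit(c, status)   # propagate regardless of the status

printed = []
DFS(Print(["x", "y"], BaseSearch({"x": [0, 1], "y": [0, 1]}), printed)).run(Node())
for row in printed:
    print(row)
```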



3.4 Binary Combinators 

Binary combinators are one step up from unary ones. They combine two complete search 
heuristics into a composite one. The most basic binary combinator is the binary version of 
and. For instance, if we need to label two sets of variables, we can do so with 

and(base_search(vars1, ...), base_search(vars2, ...))

The principle shown here easily generalizes to n-ary combinators.

And. The (binary) and combinator has two children, left and right. In order to keep
track of what child combinator is handling a particular node, the and combinator associates
with every node an inLeft Boolean variable. The local keyword indicates that every
node has its own instance of that variable. We denote the instance of the inLeft variable
associated with node c as c.inLeft.

When entering a node, it is delegated to the left or right combinator based on
inLeft. At the start, the root node is delegated to the left combinator, so its inLeft
variable is set to true. The value of inLeft is inherited in init from the current node
to its children. Upon a successful exit for left, the leaf node becomes the root of a new
subtree that is further handled by the right combinator.



combinator and(left, right)
  local bool inLeft

  start(root):
    root.inLeft=true
    left.start(root)

  enter(c):
    if c.inLeft
      left.enter(c)
    else
      right.enter(c)

  exit(c, status):
    if c.inLeft and status==success
      c.inLeft=false
      right.start(c)
      right.enter(c)
    else
      parent.exit(c, status)

  init(p, c):
    c.inLeft=p.inLeft
    if c.inLeft
      left.init(p, c)
    else
      right.init(p, c)

Note that the right combinator is started repeatedly, once for each leaf node of
left. In general, each combinator can be managing multiple subtrees of the search.

Multiple and combinators may be handling a search node at the same time. For instance
in a heuristic of the form and(and(s1,s2),s3), two and combinators are active at the same
time. The scoping of the associated variables works in such a way that each and has its own
instance of inLeft for each node.
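
The per-node, per-combinator scoping of inLeft can be sketched in Python by keying a node's associated information on the combinator instance. The sketch below is a toy under stated assumptions (the `info` dictionary, `wire`, and the demo-only `leaves` record are not from the article); it runs the nested composition and(and(s1,s2),s3) over three Boolean variables.

```python
SUCCESS, FAILURE, ABORT = "success", "failure", "abort"

class Node:
    def __init__(self, assignment=()):
        self.assignment = dict(assignment)  # stand-in for the solver State
        self.info = {}                      # per-combinator associated information

class Combinator:
    """Default protocol: start/enter/init go to the first child, exit goes up."""
    def __init__(self, *children):
        self.children = children
    def wire(self, parent, engine):
        self.parent, self.engine = parent, engine
        for child in self.children:
            child.wire(self, engine)
    def start(self, root): self.children[0].start(root)
    def enter(self, c): self.children[0].enter(c)
    def init(self, p, c): self.children[0].init(p, c)
    def exit(self, c, status): self.parent.exit(c, status)

class BaseSearch(Combinator):
    def __init__(self, domains):
        super().__init__()
        self.domains = domains
    def start(self, root): pass
    def init(self, p, c): pass
    def enter(self, c):
        free = [v for v in self.domains if v not in c.assignment]
        if not free:
            return self.parent.exit(c, SUCCESS)
        for val in reversed(self.domains[free[0]]):  # stack pops smallest value first
            child = Node(c.assignment)
            child.assignment[free[0]] = val
            self.engine.init(c, child)
            self.engine.queue.append(child)

class DFS:
    def __init__(self, heuristic):
        self.heuristic, self.queue, self.leaves = heuristic, [], []
        heuristic.wire(self, self)
    def run(self, root):
        self.heuristic.start(root)
        self.heuristic.enter(root)
        while self.queue:
            self.heuristic.enter(self.queue.pop())
        return self.leaves
    def init(self, p, c): self.heuristic.init(p, c)
    def exit(self, c, status):  # demo-only: record what reaches the engine
        self.leaves.append((status, dict(c.assignment)))

class And(Combinator):
    """Binary and: c.info[self] plays the role of the local inLeft flag."""
    def __init__(self, left, right):
        super().__init__(left, right)
        self.left, self.right = left, right
    def start(self, root):
        root.info[self] = True
        self.left.start(root)
    def enter(self, c):
        (self.left if c.info[self] else self.right).enter(c)
    def exit(self, c, status):
        if c.info[self] and status == SUCCESS:
            c.info[self] = False      # leaf of left becomes root of a right subtree
            self.right.start(c)
            self.right.enter(c)
        else:
            self.parent.exit(c, status)
    def init(self, p, c):
        c.info[self] = p.info[self]
        (self.left if c.info[self] else self.right).init(p, c)

s1, s2, s3 = (BaseSearch({v: [0, 1]}) for v in "xyz")
leaves = DFS(And(And(s1, s2), s3)).run(Node())
solutions = [a for status, a in leaves if status == SUCCESS]
print(len(solutions), solutions[0], solutions[-1])
```

Because each `And` instance uses itself as the dictionary key, the inner and outer combinators keep separate flags on the same node, which is the scoping rule described above.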



3.5 Reusable Combinators 

Now we show how a monolithic combinator can be decomposed into more primitive com- 
binators that can be reused for other purposes. 

Monolithic Combinator. We start from the following limitsolutions combinator that prunes 
the search after cutoff solutions have been found. One new concept is the notion of a 
global variable associated with a (sub)tree: all descendants of root (implicitly) share the 
same instance of count. Hence, any update of count by one node is seen by all other 
nodes in the (sub)tree. 



combinator limitsolutions(cutoff, child)
  global int count

  start(root):
    root.count = 0
    child.start(root)

  enter(c):
    if count == cutoff
      parent.exit(c, abort)
    else
      child.enter(c)

  exit(c, status):
    if status==success
      c.count++
    parent.exit(c, status)



Fig. 4: The decomposition of the limitsolutions combinator



Decomposition. We can split up the above limitsolutions combinator into three different 
combinators: ifthenelse, solutionslimit and prune. They form a directed acyclic graph as 
depicted in Figure 4 or denoted as an expression with sharing below: 



limitsolutions(cutoff, s) = ifthenelse(s', s', prune)

where

s' = solutionslimit(cutoff, s)



Here, solutionslimit monitors the number of solutions produced by s. If this number reaches 
the cutoff, then ifthenelse switches to prune, which discards the remaining nodes in the 
tree. 

We now elaborate on each of these combinators individually. 



Prune. The prune combinator is a minimal base combinator that immediately exits every 
node with the abort status. The start message is empty, and the exit and init mes- 
sages are never called. 



combinator prune()
  enter(c):
    parent.exit(c, abort)



Solutions Count. The solutionslimit combinator below illustrates how statistics gathering 
combinators are implemented. It implements a sub-protocol of combinator with an extra 
message eval that queries the current Boolean value: 

protocol condition extends combinator
  eval(currentNode);

In the case of solutionslimit, the returned Boolean value is whether a particular number
(cutoff) of solutions has not yet been reached by its child. For this purpose it maintains
the number of solutions found so far in a global variable.



condition solutionslimit(cutoff, child)
  global int count

  start(root):
    root.count = 0
    child.start(root)

  exit(c, status):
    if status==success
      c.count++
    parent.exit(c, status)

  eval(c):
    return c.count < cutoff



Ifthenelse. The ifthenelse combinator is parametrized by one condition and two child 
combinators. It associates with every node whether it is handled by the left child (inLeft); 
this is the case for the root node. Whenever a node c is entered that is inLeft, the condi- 
tion is checked. If the condition fails, c becomes the root of a subtree that is further handled 
by right. 



combinator ifthenelse(cond, left, right)
  local bool inLeft

  start(root):
    root.inLeft=true
    left.start(root)

  enter(c):
    if not c.inLeft
      right.enter(c)
    else if cond.eval(c)
      left.enter(c)
    else
      c.inLeft=false
      right.start(c)
      right.enter(c)

  init(p, c):
    c.inLeft=p.inLeft
    if c.inLeft
      left.init(p, c)
    else
      right.init(p, c)
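
The decomposition of limitsolutions into ifthenelse, solutionslimit and prune can be sketched in Python as follows. This is a toy under stated assumptions, not the article's code: the "global" count is modeled as a one-element list shared by all descendants of the root, the helper `limit_solutions` and the demo-only `leaves` record are names invented for this example, and the condition is assumed to be the same object as the left child (the sharing s' in the expression above).

```python
SUCCESS, FAILURE, ABORT = "success", "failure", "abort"

class Node:
    def __init__(self, assignment=()):
        self.assignment = dict(assignment)  # stand-in for the solver State
        self.info = {}                      # per-combinator associated information

class Combinator:
    def __init__(self, *children):
        self.children = children
    def wire(self, parent, engine):
        self.parent, self.engine = parent, engine
        for child in self.children:
            child.wire(self, engine)
    def start(self, root): self.children[0].start(root)
    def enter(self, c): self.children[0].enter(c)
    def init(self, p, c): self.children[0].init(p, c)
    def exit(self, c, status): self.parent.exit(c, status)

class BaseSearch(Combinator):
    def __init__(self, domains):
        super().__init__()
        self.domains = domains
    def start(self, root): pass
    def init(self, p, c): pass
    def enter(self, c):
        free = [v for v in self.domains if v not in c.assignment]
        if not free:
            return self.parent.exit(c, SUCCESS)
        for val in reversed(self.domains[free[0]]):  # stack pops smallest value first
            child = Node(c.assignment)
            child.assignment[free[0]] = val
            self.engine.init(c, child)
            self.engine.queue.append(child)

class DFS:
    def __init__(self, heuristic):
        self.heuristic, self.queue, self.leaves = heuristic, [], []
        heuristic.wire(self, self)
    def run(self, root):
        self.heuristic.start(root)
        self.heuristic.enter(root)
        while self.queue:
            self.heuristic.enter(self.queue.pop())
        return self.leaves
    def init(self, p, c): self.heuristic.init(p, c)
    def exit(self, c, status):  # demo-only: record what reaches the engine
        self.leaves.append((status, dict(c.assignment)))

class Prune(Combinator):
    def start(self, root): pass
    def init(self, p, c): pass
    def enter(self, c): self.parent.exit(c, ABORT)

class SolutionsLimit(Combinator):
    def __init__(self, cutoff, child):
        super().__init__(child)
        self.cutoff = cutoff
    def start(self, root):
        root.info[self] = [0]          # one shared cell per subtree: the "global" count
        self.children[0].start(root)
    def init(self, p, c):
        c.info[self] = p.info[self]    # descendants share the same cell
        self.children[0].init(p, c)
    def exit(self, c, status):
        if status == SUCCESS:
            c.info[self][0] += 1
        self.parent.exit(c, status)
    def eval(self, c):
        return c.info[self][0] < self.cutoff

class IfThenElse(Combinator):
    def __init__(self, cond, left, right):
        super().__init__(left, right)
        self.cond, self.left, self.right = cond, left, right
    def start(self, root):
        root.info[self] = True         # the local inLeft flag
        self.left.start(root)
    def enter(self, c):
        if c.info[self] and not self.cond.eval(c):
            c.info[self] = False       # c becomes the root of a right subtree
            self.right.start(c)
        (self.left if c.info[self] else self.right).enter(c)
    def init(self, p, c):
        c.info[self] = p.info[self]
        (self.left if c.info[self] else self.right).init(p, c)

def limit_solutions(cutoff, s):
    shared = SolutionsLimit(cutoff, s)   # s' in the text, used twice
    return IfThenElse(shared, shared, Prune())

leaves = DFS(limit_solutions(2, BaseSearch({"x": [0, 1], "y": [0, 1]}))).run(Node())
statuses = [status for status, _ in leaves]
print(statuses)
```

With a cutoff of 2, the remaining subtree under x = 1 is exited with abort as soon as it is entered, instead of being explored.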



3.6 Restarting Combinators

Restarting the search is common to several combinators; the mechanism is illustrated below
in the portfolio combinator.

Portfolio. Like the ifthenelse and and combinators, the portfolio combinator switches
between child combinators. Only the logic for switching is more complex. In order to simplify
presentation, we again restrict the code to the binary case; the n-ary variant is a
straightforward generalization.

Firstly, portfolio keeps track of a global "reference" count ref of unprocessed nodes
to be handled by the s1 child. This count is incremented whenever a new child node is
initialized, and decremented whenever a node is entered for actual processing.

When the last node of s1 exits (witnessed by the reference count being 0) and the search
was not exhaustive, the search starts over from the root, but now with the s2 child. In order
to decide about exhaustiveness, the portfolio combinator registers whether any exit with
status abort occurred. At the same time it converts an abort inside s1 into a failure,
because the s2 combinator may still perform an exhaustive search and avoid overall
non-exhaustiveness. In order to restart from the root, a copy of the root node is made at the
start.

Upon a successful exit, the leaf node becomes the root of a new subtree that is further
handled by the s2 combinator.

combinator portfolio(s1, s2)
  global node copy
  global bool inLeft
  global bool exhaustive
  global int ref

  start(root):
    copy=root.copy()
    root.inLeft=true
    root.exhaustive=true
    root.ref=1
    s1.start(root)

  enter(c):
    if c.inLeft
      ref--
      s1.enter(c)
    else
      s2.enter(c)

  exit(c, status):
    if not c.inLeft
      parent.exit(c, status)
    else
      if status==abort
        status=failure
        c.exhaustive=false
      if c.ref==0
        if c.exhaustive
          parent.exit(c, status)
        else
          copy.inLeft=false
          s2.start(copy)
          self.enter(copy)
      else
        parent.exit(c, status)

  init(p, c):
    ref++
    if c.inLeft
      s1.init(p, c)
    else
      s2.init(p, c)
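
The restart logic can be sketched in Python as follows. This is a toy reading of the pseudocode above, not the article's implementation: the global variables of one search tree are modeled as a dictionary `g` shared by all nodes of that tree (including the copied root), and `Node`, `wire`, `info` and the demo-only `leaves` record are assumptions of this sketch.

```python
SUCCESS, FAILURE, ABORT = "success", "failure", "abort"

class Node:
    def __init__(self, assignment=()):
        self.assignment = dict(assignment)  # stand-in for the solver State
        self.info = {}                      # per-combinator associated information

class Combinator:
    def __init__(self, *children):
        self.children = children
    def wire(self, parent, engine):
        self.parent, self.engine = parent, engine
        for child in self.children:
            child.wire(self, engine)
    def start(self, root): self.children[0].start(root)
    def enter(self, c): self.children[0].enter(c)
    def init(self, p, c): self.children[0].init(p, c)
    def exit(self, c, status): self.parent.exit(c, status)

class BaseSearch(Combinator):
    def __init__(self, domains):
        super().__init__()
        self.domains = domains
    def start(self, root): pass
    def init(self, p, c): pass
    def enter(self, c):
        free = [v for v in self.domains if v not in c.assignment]
        if not free:
            return self.parent.exit(c, SUCCESS)
        for val in reversed(self.domains[free[0]]):  # stack pops smallest value first
            child = Node(c.assignment)
            child.assignment[free[0]] = val
            self.engine.init(c, child)
            self.engine.queue.append(child)

class DFS:
    def __init__(self, heuristic):
        self.heuristic, self.queue, self.leaves = heuristic, [], []
        heuristic.wire(self, self)
    def run(self, root):
        self.heuristic.start(root)
        self.heuristic.enter(root)
        while self.queue:
            self.heuristic.enter(self.queue.pop())
        return self.leaves
    def init(self, p, c): self.heuristic.init(p, c)
    def exit(self, c, status):  # demo-only: record what reaches the engine
        self.leaves.append((status, dict(c.assignment)))

class Prune(Combinator):
    def start(self, root): pass
    def init(self, p, c): pass
    def enter(self, c): self.parent.exit(c, ABORT)

class Portfolio(Combinator):
    """Binary portfolio; the dict g models the globals shared by one search tree."""
    def __init__(self, s1, s2):
        super().__init__(s1, s2)
        self.s1, self.s2 = s1, s2
    def start(self, root):
        copy = Node(root.assignment)
        copy.info = dict(root.info)       # keep information of enclosing combinators
        g = {"copy": copy, "in_left": True, "exhaustive": True, "ref": 1}
        root.info[self] = copy.info[self] = g
        self.s1.start(root)
    def enter(self, c):
        g = c.info[self]
        if g["in_left"]:
            g["ref"] -= 1
            self.s1.enter(c)
        else:
            self.s2.enter(c)
    def exit(self, c, status):
        g = c.info[self]
        if not g["in_left"]:
            return self.parent.exit(c, status)
        if status == ABORT:               # an abort inside s1 becomes a failure
            status, g["exhaustive"] = FAILURE, False
        if g["ref"] == 0 and not g["exhaustive"]:
            g["in_left"] = False          # restart from the copied root with s2
            self.s2.start(g["copy"])
            self.enter(g["copy"])
        else:
            self.parent.exit(c, status)
    def init(self, p, c):
        g = c.info[self] = p.info[self]
        g["ref"] += 1
        (self.s1 if g["in_left"] else self.s2).init(p, c)

gave_up = DFS(Portfolio(Prune(), BaseSearch({"x": [0, 1]}))).run(Node())
complete = DFS(Portfolio(BaseSearch({"x": [0, 1]}), Prune())).run(Node())
print(gave_up, complete)
```

In the first run s1 aborts immediately, so the search restarts from the copied root with s2; in the second run s1 is exhaustive and s2 is never started.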



4 Modular Combinator Implementation 

The message-based combinator approach lends itself well to different implementation strate- 
gies. In the following we briefly discuss two diametrically opposed approaches we have 
explored: dynamic composition (interpretation) and static composition (compilation). Us- 
ing these different approaches, combinators can be adapted to the implementation choices 
of existing solvers. Sect. 5 shows that both implementation approaches have competitive 
performance. 



4.1 Dynamic Composition

To support dynamic composition, we have implemented our combinators as C++ classes
whose objects can be allocated and composed into a search specification at runtime. The
protocol events correspond to virtual method calls between these objects. For the delegation
mechanism from one object to another, we explicitly encode a form of dynamic inheritance
called open recursion or mixin inheritance [ ]. In contrast to the OOP inheritance built into
C++ and Java, this mixin inheritance provides two essential abilities: 1) to determine the
inheritance graph at runtime and 2) to use multiple copies of the same combinator class at
different points in the inheritance graph. C++'s built-in static inheritance provides
neither.

The C++ library currently builds on top of the Gecode constraint solver [ ]. However, the
solver is accessed through a layer of abstraction that is easily adapted to other solvers (e.g.,
we have a prototype interface to the Gurobi MIP solver). The complete library weighs in at
around 2500 lines of code, which is even less than Gecode's native search and branching
components.



4.2 Static Composition

In a second approach, also on top of Gecode, we statically compile a search specification to 
a tight C++ loop. Again, every combinator is a separate module independent of other com- 
binator modules. A combinator module now does not directly implement the combinator's 
behavior. Instead it implements a code generator (in Haskell), which in turn produces the 
C++ code with the expected behavior. 

Hence, our search language compiler parses a search specification, and composes (in 
mixin-style) the corresponding code generators. Then it runs the composite code generator 
according to the message protocol. The code generators produce appropriate C++ code frag- 
ments for the different messages, which are combined according to the protocol into the 
monolithic C++ loop. This C++ code is further post-processed by the C++ compiler to yield a 
highly optimized executable. 
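
The idea of composing code generators rather than objects can be sketched in Python. The article's generators are written in Haskell and emit C++; the toy below instead emits Python source text and uses `exec` in place of the C++ compiler, and every name in it (`base_search`, `printer`, `dfs`, `indent`) is an assumption of this sketch. Each generator receives the exit code of its parent and inlines it, so the result is a single loop without calls between combinators.

```python
def indent(lines, levels=1):
    """Indent a list of source lines by the given number of levels."""
    return ["    " * levels + line for line in lines]

def base_search():
    """Generator for the base search: returns enter-code given the parent's exit-code."""
    def enter(exit_code):
        return [
            "free = [v for v in domains if v not in c]",
            "if not free:",
            *indent(exit_code("'success'")),      # bottom-up exit, inlined
            "else:",
            "    for val in reversed(domains[free[0]]):",
            "        queue.append({**c, free[0]: val})",
        ]
    return enter

def printer(child_enter):
    """Generator for print: wraps the exit-code that the child will inline."""
    def enter(exit_code):
        def my_exit(status):
            return [f"if {status} == 'success':",
                    "    out.append(dict(c))",
                    *exit_code(status)]
        return child_enter(my_exit)
    return enter

def dfs(heuristic_enter):
    """Generator for the engine: emits one monolithic loop as source text."""
    body = heuristic_enter(lambda status: ["pass"])   # the engine's exit is empty
    return "\n".join([
        "def search(domains):",
        "    out, queue = [], [{}]",
        "    while queue:",
        "        c = queue.pop()",
        *indent(body, 2),
        "    return out",
    ])

source = dfs(printer(base_search()))
namespace = {}
exec(source, namespace)               # stands in for the C++ compilation step
found = namespace["search"]({"x": [0, 1], "y": [0, 1]})
print(source)
print(found)
```

Adding another combinator means adding another generator function; the existing generators are left untouched, which mirrors the modularity argument made for the mixin approach.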

As for dynamic composition, the mixin approach is crucial, allowing us to add more
combinators without touching the existing ones. At the same time we obtain with the press
of a button several thousand lines of custom low-level code for the composition of just a few
combinators. In contrast, the development cost of handcrafted code is prohibitive.

A compromise between the above two approaches, itself static, is to employ the built-in 
mixin mechanism (also called traits) available in object-oriented languages such as Scala [ ] 
to compose combinators. A dynamic alternative is to generate the combinator implemen- 
tations using dynamic compilation techniques, for instance using the LLVM (Low Level 
Virtual Machine) framework. These options remain to be explored. 

4.3 MiniZinc with Combinators 

As a proof of concept and platform for experiments, we have integrated search combinators 
into a complete MiniZinc toolchain, comprising a pre-compiler and a FlatZinc interpreter. 

The pre-compiler is necessary to support arbitrary expressions in annotations, such as 
the condition expressions for an ifthenelse. The expressions are translated into standard 
MiniZinc annotations that are understood by the FlatZinc interpreter. User-defined variables 
have type-inst svar int and can be introduced using the standard MiniZinc let construct. 
The annotation construct of MiniZinc has been extended to support simple function 
definitions. The following example shows a MiniZinc version of the restart-based branch- 
and-bound heuristic from Sect. 2.3: 

annotation limit(var bool: c, ann: s) =
  ifthenelse(c, s, prune);

annotation once(ann: s) = limit(solutions < 1, s);

annotation rbab(var int: obj, ann: s) =
  let { svar int: best = MAXINT } in
  restart(true, and([
    post(obj < best),
    once(s),
    assign(best, obj) ]));

solve :: rbab(x, int_search(y, input_order, assign_lb)) satisfy;

The pre-compiler translates this code as follows:

solve :: sh_let(sh_letvar("best"), sh_int(MAXINT),
  sh_restart(sh_cond_true, sh_and([
    sh_post_succeed(sh_cond_lt(sh_intvar(objective),
                               sh_letvar("best"))),
    sh_let(sh_letvar("solutioncount"), 0,
      sh_ifthenelse(sh_cond_lt(sh_letvar("solutioncount"),
                               sh_int(1)),
        sh_solutioncount(sh_letvar("solutioncount"),
          sh_int_search(x, sh_var_input_order,
                        sh_val_assign_lb)),
        sh_prune)),
    sh_assign(sh_letvar("best"), sh_intvar(objective)) ])))
satisfy;

All literals are quoted (e.g. sh_int(1)), user-defined search variables are turned into
quoted strings (sh_letvar("best")), expressions like obj < best are translated into
annotation terms (sh_cond_lt ...), and statistics are made explicit, introducing search variables
and special combinators (sh_solutioncount). The result of the pre-compilation is valid,
well-typed MiniZinc, which is then passed through the standard mzn2fzn translator to
produce FlatZinc ready for solving. We intend to incorporate the translations done by the
pre-compiler into the standard mzn2fzn in the future.

We extended the Gecode FlatZinc interpreter to parse the search combinator
annotation and construct the corresponding heuristic using the Dynamic Composition approach
described above. The three tools, pre-compiler, mzn2fzn, and the modified FlatZinc
interpreter, thus form a complete toolchain for solving MiniZinc models using search
combinators. The source code including examples can be downloaded from
http://www.gecode.org/flatzinc.html.



5 Experiments 

This section evaluates the performance of our two implementations. It establishes that a 
search heuristic specified using combinators is competitive with a custom implementation 
of the same heuristic, exploring exactly the same tree. 

Sect. 3.1 introduced a message protocol that defines the communication between the 
different combinators for one node of the search tree. Any overhead of a combinator-based 
implementation must therefore come from the processing of each node using this protocol. 
All combinators discussed earlier process each message of the protocol in constant time 
(except for the basesearch combinators, of course). Hence, we expect at most a constant 
overhead per node compared to a native implementation of the heuristic. 

In the following, two sets of experiments confirm this expectation. The first set consists 
of artificial benchmarks designed to expose the overhead per node. The second set consists 
of realistic combinatorial problems with complex search strategies. 

The experiments were run on a 2.26 GHz Intel Core 2 Duo running Mac OS X. The 
results are the averages of 10 runs, with a coefficient of deviation less than 1.5%. 

Stress Test. The first set of experiments measures the overhead of calling a single
combinator during search. We ran a complete search of a tree generated by 7 variables with domain
{0, ..., 6} and no constraints (1 647 085 nodes). To measure the overhead, we constructed a
basic search heuristic s and a stack of n combinators:

portfolio([portfolio([...portfolio([s, prune])..., prune]), prune])

where n ranges up to 20 (realistic combinator stacks, such as those from the examples
in this article, are usually not deeper than 10). The numbers in the following table report
the runtime relative to using the plain heuristic s, for both the static and the dynamic
approach:



n              1       2       5      10      20
static %     106.6   107.7   112.0   148.3   157.5
dynamic %    107.3   117.6   145.2   192.6   260.9



A single combinator generates an overhead of around 7%, and 10 combinators add 50% for 
the static and 90% for the dynamic approach. In absolute runtime, however, this translates 
to an overhead of around 17 ms (70 ms) per million nodes and combinator for the static 
(dynamic) approach. Note that this is a worst-case experiment, since there is no constraint 
propagation and almost all the time is spent in the combinators. 
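
The node count quoted for the stress test is consistent with binary branching (x = v versus x ≠ v), under which every complete assignment is a leaf of a full binary tree. That branching scheme is an assumption of this check, not something stated in the text; the function name below is likewise hypothetical. The same snippet converts the reported per-million-node overhead into a per-node figure.

```python
def binary_tree_nodes(num_vars: int, domain_size: int) -> int:
    """Nodes of a full binary search tree with one leaf per complete assignment."""
    leaves = domain_size ** num_vars      # 7 ** 7 complete assignments
    return 2 * leaves - 1                 # a full binary tree with L leaves has 2L - 1 nodes

nodes = binary_tree_nodes(7, 7)

# 17 ms (static) and 70 ms (dynamic) per million nodes and combinator,
# expressed in nanoseconds per node and combinator.
NS_PER_MS = 1_000_000
static_ns_per_node = 17 * NS_PER_MS // 1_000_000
dynamic_ns_per_node = 70 * NS_PER_MS // 1_000_000

print(nodes, static_ns_per_node, dynamic_ns_per_node)
```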

Benchmarks. The second set of experiments shows that in practice, this overhead is dwarfed 
by the cost of constraint propagation and backtracking. Note that the experiments are not 
supposed to demonstrate the best possible search heuristics for the given problems, but that 
a search heuristic implemented using combinators is just as efficient as a native implemen- 
tation. 

Fig. 5 compares Gecode's optimization search engines with branch-and-bound imple- 
mented using combinators. On the well-known Golomb Rulers problem, both dynamic com- 
binators and native Gecode are slightly slower than static combinators. Native Gecode uses 
dynamically combined search heuristics, but is much less expressive. That is why the static 
approach with its specialization yields better results. 

On the radiotherapy problem (see Sect. 2.3), the dynamic combinators show an over- 
head of 6-11%. For native Gecode, exhonce must be implemented as a nested search, 
which performs similarly to the dynamic combinators. However, in instances 5 and 6, the 
compiled combinators lose their advantage over native Gecode. This is due to the processing 
of exhonce: As soon as it is finished, the combinator approach processes all nodes of the 
exhonce tree that are still in the search queue, which are now pruned by exhonce. The na- 
tive Gecode implementation simply discards the tree. We will investigate how to incorporate 
this optimization into the combinator approach. 

The job shop scheduling examples, using the combination of branch-and-bound and
discrepancy limit discussed in Sect. 2.3, show similar behavior. In ABZ1-5 and mt10, the
interpreted combinators show much less overhead than in the short-running instances. This
is due to more expensive propagation and backtracking in these instances, reducing the
relative overhead of executing the combinators.

In summary, the experiments show that the expressiveness and flexibility of a rich 
combinator-based search language can be achieved without any runtime overhead in the 
case of the static approach, and little overhead for the dynamic version. 



6 Related Work 

This section explores and discusses previous work that is closely related to search combina- 
tors as presented in this article. 
Benchmark          Compiled     Interpreted   Gecode
...                ...          101.8%        102.5%
Golomb 11          12.72 s      102.9%        101.8%
Golomb 12          125.40 s     100.6%        101.9%
Radiotherapy 1     71.13 s      105.9%        107.3%
Radiotherapy 2     6.22 s       110.9%        110.1%
Radiotherapy 3     11.78 s      108.3%        108.1%
Radiotherapy 4     16.44 s      107.5%        106.9%
Radiotherapy 5     69.89 s      108.1%        98.7%
Radiotherapy 6     106.04 s     109.2%        99.1%
Job Shop G2        7.25 s       146.3%        101.2%
Job Shop G4        6.96 s       164.0%        107.75%
Job Shop H1        38.05 s      153.1%        103.81%
Job Shop H3        52.02 s      162.5%        102.8%
Job Shop H5        20.88 s      153.2%        107.0%
Job Shop ABZ1-5    2319.00 s    103.7%        100.1%
Job Shop mt10      2181.00 s    104.5%        99.9%

Fig. 5: Experimental results



6.1 MCP 

This work directly extends our earlier work on Monadic Constraint Programming (MCP) 
[ ]. MCP introduces stackable search transformers, which are a simple form of search 
combinators, but only provide a much more limited and low level form of search control. In 
trying to overcome its limitations we arrived at search combinators. 



6.2 Constraint Logic Programming 



Constraint logic programming languages such as ECLiPSe [ ] and SICStus Prolog [25] 
provide programmable search via the built-in search of the paradigm, allowing the user to 
define goals in terms of conjunctive or disjunctive sub-goals. 

Prolog's limitation is that it does not permit cross-cutting between goals. For instance, 
disjunctions inside goals are too well encapsulated to observe them or interfere with them 
from outside that goal. Hence, combinators that inject additional behavior in disjunctions, 
i.e. to observe and/or prune the number of branches, cannot be expressed in a modular 
way. In contrast, cross-cutting is a crucial feature of our combinator approach, where a 
combinator higher up in the stack can interfere with a sub-combinator, while remaining 
fully compositional. In summary, apart from conjunction and disjunction, Prolog's goal- 
based heuristics cannot be combined arbitrarily. 

ECLiPSe copes with this limitation by combining a limited number of search heuristics
into a monolithic search/6 predicate. With various parameters the user controls which of
the heuristics is enabled (e.g., depth-bounded, node-bounded or limited discrepancy search).
A fixed number of compositions are supported, such as changing strategy when the depth
bound finishes. The labeling itself is user programmable. If users are not happy with the set
of supported heuristics in search/6, they have to program their own from scratch.

IBM ILOG CP Optimizer [ ] supports Prolog-style goals in C++ [ ], and like Prolog 
goals, these do not support cross-cutting. 



6.3 The Comet Language

The Comet [ ] system features fully programmable search [ ], built upon the basic con- 
cept of continuations, which make it easy to capture the state of the solver and write non- 
deterministic code. 

The Comet library provides abstractions like the non-deterministic primitives try and
tryall that split the search specification in two (orthogonal) parts: 1) the specification
of the search tree, which corresponds to our base_search heuristics, and 2) the
exploration of that search tree by means of a search controller. In terms of our approach,
the search controller determines both the queueing strategy and the behavior of the search
heuristic (minus the base search) within a single entity. In other words, it defines what to do
when starting or ending a search, failing, or adding a new choice.

Complex heuristics can be constructed as custom controllers, either by inheriting from 
existing controllers or implementing them from scratch. 

Albeit at a different level of abstraction (e.g., compare the Comet definition of
depth-bounded search in Figure 6 to the combinator definition dbs(n,s) = limit(depth < n,s)),
search controllers are quite similar to combinators as presented in this article. However,
there is one essential difference. Our combinators are meant to be compositional, whereas
search controllers are not. This difference in spirit is clearly reflected in 1) the design of the
interface and its associated protocol, and 2) the instances:

1. The design of search controllers is simpler than that of search combinators because it
does not take compositionality into account. While many of the messages in the two
approaches are similar in spirit, the search combinator approach also stipulates the flow of
messages within a search combinator composition. Notably, while most of the messages
propagate top-down through a combinator stack, it is vital to compositionality that the
exit message proceeds in a bottom-up manner. For instance, this bottom-up flow
enables the inner and combinator in the composition and(and(s1,s2),s3) to intercept leaf
nodes of s1 and start s2 before its parent starts s3. The other way around would clearly
exhibit an undesirable semantics.

In Comet, this compositional protocol is entirely absent. All messages are directed at 
the single search controller. 

2. In terms of instantiation, because of their compositional nature, we promote many "small"
combinator instances that each capture a single primitive feature. This approach provides
us with a high-level modeling language for search, as the primitive combinators are
conveniently assembled into many different search heuristics. In contrast, all Comet search
controller instances we are aware of (i.e., those published in papers and shipped with the
Comet library) are essentially monolithic implementations of a
particular search heuristic; none of them takes other search controllers as arguments.
Through a common abstract base class the instances share some basic infrastructure, but
to implement a new search controller one basically starts from scratch.

The fact that search controllers have not been designed with compositionality in mind
obviously does not mean that compositionality cannot be achieved in Comet. On the
contrary, we believe that it is most easily achieved by integrating search controllers with the
compositional design of our search combinators. In fact, because of Comet's powerful
primitives for non-determinism, this would lead to a particularly elegant implementation.


class DBS extends AbstractSearchController {
  stack{Continuation} s;
  int limit;
  DBS(SearchSolver so, int n) : AbstractSearchController(so) {
    s = new stack{Continuation}();
    limit = n;
  }
  void startTry() {
    if (s.getSize() > limit) fail();
  }
  void addChoice(Continuation f) {
    s.push(f);
  }
  void fail() {
    if (s.empty())
      exit();
    else
      call(s.pop());
  }
}



Fig. 6: Definition of depth-bounded search in Comet. 



6.4 Other Systems 

The Salsa [ ] language is an imperative domain-specific language for implementing search 
algorithms on top of constraint solvers. Its center of focus is a node in the search process. 
Programmers can write custom Choice strategies for generating next nodes from the current 
one; Salsa provides a regular-expression-like language for combining these Choices into 
more complex ones. In addition, Salsa can run custom procedures at the exits of each node, 
right after visiting it. We believe that Salsa's Choice construct is orthogonal to our approach 
and could be incorporated. Custom exit procedures show similarity to combinators, but no 
support is provided for arbitrary composition. 

Oz [ ] was the first language to truly separate the definition of the constraint model 
from the exploration strategy [ ]. Computation spaces capture the solver state and the pos- 
sible choices. Strategies such as DFS, BFS, LDS, Branch and Bound and Best First Search 
are implemented by a combination of copying and recomputation of computation spaces. 
The strategies are monolithic; there is no notion of search combinators.

Zinc/MiniZinc [13, 14] lets the user specify search in its annotation language. There is 
a proposal for a more expressive search language for MiniZinc [ ], but it is limited to basic 
variable ordering and domain splitting strategies. For Zinc, a language extension is available 
for implementing variable selection and domain splitting [ ] but again it does not address 
more than basic search. 



6.5 Autonomous Search 

Autonomous search (AS) [ ] addresses the challenge of providing complex application- 
tailored search heuristics in a different way. Rather than leaving the specification and tuning 
of search heuristics to the programmer, AS promotes systems that autonomously self-tune 
their performance while solving problems. Hence, while search combinators make writing 
search heuristics easier, AS takes it out of the hands of the programmer altogether.
Well-known instances of this approach are Impact Based Search [18] or the weighted degree
heuristic [ ].

AS has advantages for 1) smaller problems where it produces a decent heuristic without 
programmer investment, and for 2) novice users who don't know how to obtain a decent 
heuristic. However, loss of programmer control is a liability for hard problems where AS 
can be ineffective and often only expert knowledge makes the difference. 



7 Conclusion 

We have shown how combinators provide a powerful high-level language for modeling com- 
plex search heuristics. To make this approach useful in practice, we devised an architecture 
in which the modularity of the language is matched by the modularity of the implementation. 
This relieves system developers from a high implementation cost and yet, as our experiments 
show, imposes no runtime penalty. 

For future work, parallel search on multi-core hardware fits perfectly in our combina- 
tor framework. We have already performed a number of preliminary experiments and will 
further explore the benefits of search combinators in a parallel setting. We will also explore 
potential optimizations (such as the short-circuit of exhonce from Sect. 5) and different 
compilation strategies (e.g., combining the static and dynamic approaches from Sect. 4). 

In addition, we are considering applying search combinators in other problem domains like
Mixed Integer Programming (MIP) and A*, where search strategies have a major impact on
performance and no dominant default search exists. Finally, combinators need not
necessarily be heuristics that control the search. They may also monitor search, e.g., by gathering
statistics or visualizing the search tree.